Speech recognizer for voice control of mobile telephone
Abstract
Infovox is marketing a speaker-dependent, pattern-matching word recognition system, developed at KTH. The algorithms in the system have been modified for noise immunity, and performance has been evaluated in moving cars. The main problems were word detection and noise compensation. After simulations we decided to use a close-talking microphone and a "noise addition" method, where we added the measured noise in the moving car to the reference patterns recorded in a parked car. Using this method, the recognition rate was improved from 69% to 97% on a ten-word vocabulary using the best microphone. A more extensive test was performed on the modified recognition system using two cars and twelve speakers, seven male and five female. Most of them were naive speakers. The twenty-word vocabulary contained some confusable words and was trained in a parked car. During 98 sessions, 1,960 words were read under different conditions with an average recognition rate of 86%. With closed windows at 90 km/h the mean was 91%. An open window at the same speed decreased the result to 82%.
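The "noise addition" idea described in the abstract can be illustrated with a minimal sketch. It is not the Infovox/KTH implementation: the abstract does not specify the feature representation or the matching algorithm, so this sketch assumes filter-bank energies in dB, power-domain addition of a single measured noise spectrum, and dynamic time warping (DTW) for pattern matching. All function names are hypothetical.

```python
import math


def add_noise(reference_db, noise_db):
    """Add a noise spectrum to every frame of a reference pattern.

    Assumption: frames and noise are filter-bank energies in dB, so the
    addition is done in the linear power domain and converted back.
    """
    return [
        [10.0 * math.log10(10.0 ** (s / 10.0) + 10.0 ** (n / 10.0))
         for s, n in zip(frame, noise_db)]
        for frame in reference_db
    ]


def frame_distance(a, b):
    """City-block distance between two frames (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(pattern, utterance):
    """Accumulated DTW distance between two frame sequences."""
    inf = float("inf")
    rows, cols = len(pattern), len(utterance)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = frame_distance(pattern[i - 1], utterance[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[rows][cols]


def recognize(utterance, references, noise_db):
    """Return the word whose noise-added reference is closest to the utterance.

    `references` maps each word to a pattern recorded in quiet conditions
    (the parked car); `noise_db` is noise measured in the moving car.
    """
    return min(
        references,
        key=lambda word: dtw_distance(add_noise(references[word], noise_db), utterance),
    )
```

In this sketch the references are recorded once in quiet conditions, and only the noise estimate changes with driving conditions, which is the property that makes the approach attractive for in-car use.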
Similar resources
Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations
This paper describes a novel speech-interface function, called “speech spotter”, which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation since it was not easy to judge, from only microphone input, whether a user was speaking to anothe...
Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations / Masataka Goto
This paper describes a novel speech-interface function, called “speech spotter”, which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation since it was not easy to judge, from only microphone input, whether a user was speaking to anothe...
Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement
Recognizer performance in telephone-based spoken dialogue systems may be strongly affected by the transmission channel. In order to investigate the impact of different parts of the transmission channel in more detail, a simulation model is presented. It implements all transmission characteristics of modern telephone networks, based on instrumentally measurable values as they are used by network...
Application of isolated word recognition to a voice controlled repertory dialer system
In this paper we describe a speaker trained, voice controlled, repertory dialer system. The main elements of the system include: 1. A real-time speech analyzer that detects the presence of speech on the input line, and analyzes the speech to give features appropriate for a word recognizer. 2. An isolated word recognizer that decides which of a set of words was spoken. 3. A voice response syste...
User Interface Design for Voice Control Systems
A voice control system converts spoken commands into control actions, a process which is always imperfect due to errors of the speech recognizer. Most speech recognition research is focused on decreasing the recognizers’ error rates; comparatively little effort was spent to find interface designs that optimize the overall system, given a fixed speech recognizer performance. In order to evaluate...
Voice Activity Detection Using Speech Recognizer Feedback
This paper demonstrates how feedback from a speech recognizer can be leveraged to improve Voice Activity Detection (VAD) for online speech recognition. First, reliably transcribed segments of audio are fed back by the recognizer as supervision for VAD model adaptation. This allows the much stronger LVCSR acoustic models to be harnessed without adding computation. Second, when to make a VAD deci...